Abstract

In previous work [16], we give a type system that guarantees that well-typed multi-threaded
programs are possibilistically noninterfering. If thread scheduling is probabilistic,
however, then well-typed programs may have probabilistic timing channels. We describe how
they can be eliminated without making the type system more restrictive. We show that
well-typed concurrent programs are probabilistically noninterfering if every total command
with a guard containing high variables executes atomically. The proof uses the notion of a
probabilistic state of a computation from Kozen’s work in the denotational semantics of
probabilistic programs [11].¹


1  Introduction 

This  work  is  motivated  by  applications  of  mobile  code  where  programs  are  downloaded, 
as  needed,  and  executed  on  a  trusted  host.  Here  a  host  may  have  sensitive  data  that 
downloaded  code  may  need,  and  we  want  assurance  that  they  are  not  leaked  by  the  code. 
In  some  cases,  the  best  approach  may  simply  be  to  forbid  any  access  to  the  sensitive  data, 
using  some  access  control  mechanism.  But  often  the  code  will  legitimately  need  to  access 
the  data  to  be  useful,  and  in  this  case,  we  must  be  sure  the  code  does  not  leak  it. 

Specifically, this paper is concerned with identifying conditions under which concurrent
programs, involving high (private) and low (public) variables, can be proved free of
information flows from high variables to low variables. In previous work [16], we developed
a type system that ensures that well-typed multi-threaded programs have a possibilistic
noninterference property. Possibilistic noninterference asserts that the set of possible final
values of low variables is independent of the initial values of high variables. Hence, if we
run such a program and observe some final values for its low variables, then we cannot
conclude anything about the initial values of its high variables. However, that work relies
on a purely nondeterministic thread scheduler, whose implementation requires what
Dijkstra calls an “erratic daemon” [3]. More realistically, we would expect a
mechanically-implemented scheduler to be probabilistic. But with a probabilistic scheduler, a
possibilistic noninterference property is no longer sufficient; indeed, it now becomes easy
to construct well-typed programs with probabilistic timing channels.

¹ This is an expanded version of a paper that appeared in the Proceedings of the 11th IEEE
Computer Security Foundations Workshop, pages 34-43, June 1998. Appears in Journal of
Computer Security, Vol 7, No 2-3, pp. 231-253.

This material is based upon activities supported by DARPA and by the National Science
Foundation under Agreement Nos. CCR-9612176 and CCR-9612345.

We  illustrate  this  point  with  an  example.  Suppose  that  x  is  a  high  variable  whose 
value  is  either  0  or  1,  y  is  a  low  variable,  and  c  is  a  command  that  does  not  assign  to  y. 
Also  assume  that  c  takes  many  steps  to  complete.  Consider  the  following  multi-threaded 
program: 

• Thread α:

    if x = 1 then (c; c);
    y := 1

• Thread β:

    c;
    y := 0

Assuming that thread scheduling is purely nondeterministic, this program satisfies
possibilistic noninterference: we can see by inspection that the final value of y can be
either 0 or 1, regardless of the initial value of x.

But  suppose  the  two  threads  are  actually  scheduled  probabilistically,  by  flipping  a  coin 
at  each  step  to  decide  which  thread  to  run.  Then  the  threads  run  at  roughly  the  same  rate 
and,  as  a  result,  the  value  of  x  ends  up  being  copied  into  y  with  high  probability.  That  is, 
a  change  in  the  initial  value  of  x  changes  the  probability  distribution  of  the  final  values  of 
y.  Hence  this  program  exhibits  a  probabilistic  timing  channel. 
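To make the size of this channel concrete, the following sketch computes the exact probability that y ends up 1 under the coin-flip scheduler. It is an illustration under stated assumptions rather than part of the formal development: we assume that c takes n sequential steps, that testing the guard of the conditional takes one step (and that the conditional terminates in that step when the guard is false), and that each assignment takes one step. Under these assumptions the final value of y is 1 exactly when thread α finishes after thread β. The function names are ours.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def alpha_last(a, b):
    """Probability that thread alpha finishes after thread beta, given that
    alpha still needs a steps, beta still needs b steps, and a fair coin
    picks which live thread runs next."""
    if a == 0:          # alpha has already finished, so beta writes y last
        return Fraction(0)
    if b == 0:          # beta has already finished, so alpha writes y last
        return Fraction(1)
    return (alpha_last(a - 1, b) + alpha_last(a, b - 1)) / 2

def prob_y_is_1(x, n):
    """P(final y = 1) when c takes n steps (assumed step counts, see lead-in)."""
    alpha_steps = 1 + (2 * n if x == 1 else 0) + 1   # guard, (c; c), y := 1
    beta_steps = n + 1                               # c, y := 0
    return alpha_last(alpha_steps, beta_steps)

for n in (1, 10, 100):
    print(n, float(prob_y_is_1(0, n)), float(prob_y_is_1(1, n)))
```

As n grows, the probability that y = 1 tends to 0 when x = 0 and to 1 when x = 1, which is the sense in which x is copied into y with high probability.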

Note  that  with  c  suitably  chosen,  the  program  is  well  typed  in  the  type  system  of  [16], 
and  hence  that  system  cannot  guard  against  such  channels.  One  approach  would  be  to 
modify  the  type  system  so  that  all  guards  of  conditionals  are  low.  In  this  case,  the  program 
is  no  longer  well  typed.  But  such  a  restriction  is  likely  too  burdensome  in  practice. 

Instead, our strategy is to allow the use of high variables in guards, but to require that
the resulting timing variations be masked. We accomplish this by imposing a simple
syntactic restriction: every conditional whose guard contains high variables must be executed
atomically. This is accomplished by wrapping such a conditional with a new command,
called protect [15], that guarantees that the conditional will be executed atomically in a
multi-threaded environment; the result is that timing variations (based on which branch
of the conditional is selected) will not be observable internally.² In general, we require
that any total guarded command be protected if its guard contains high variables. In the
remainder of the paper, we formally establish the soundness of our restriction by proving
that protected well-typed programs satisfy a probabilistic noninterference property, which
says that the joint probability distribution of final values of low variables is independent of
the initial values of high variables.

² Later, in Sections 6 and 8, we make some remarks about external observations.

2 Syntax and semantics

Threads are commands in the following deterministic imperative language:

(phrases)      p ::= e | c
(expressions)  e ::= x | n | e1 + e2 | e1 − e2
(commands)     c ::= x := e | c1; c2 | skip |
                     if e then c1 else c2 |
                     while e do c |
                     for e do c |
                     protect c

Metavariable  x  ranges  over  identifiers  and  n  over  integer  literals.  Integers  are  the  only 
values;  we  use  0  for  false  and  nonzero  for  true.  Note  that  expressions  do  not  have  side 
effects,  nor  do  they  contain  partial  operations  like  division.  The  command  for  e  do  c  is 
executed  by  evaluating  e,  yielding  an  integer,  and  then  executing  c  that  many  times. 

A structural operational semantics for our language is given in Figure 1. Programs
are executed with respect to a memory μ, which is a mapping from identifiers to integers.
Because of our restrictions on expressions, we know that an expression e is always well
defined in a memory μ provided that every free variable of e is in dom(μ); this will always
be the case if e is well typed. Also, we assume for simplicity that expressions are evaluated
atomically.³ Thus we simply extend a memory μ in the obvious way to map expressions to
integers, writing μ(e) to denote the value of expression e in memory μ.

The semantics defines a sequential transition relation → on configurations. A
configuration κ is either a pair (c, μ) or simply a memory μ. In the first case, c is the command
yet to be executed; in the second case, the command has terminated, yielding final memory
μ. We define the reflexive transitive closure →* in the usual way. First κ →^0 κ, for
any configuration κ, and κ →^k κ'', for k > 0, if there is a configuration κ' such that
κ →^(k−1) κ' and κ' → κ''. Then κ →* κ' if κ →^k κ' for some k ≥ 0.

Protected  sections  have  a  noninterleaving  semantics,  expressed  by  rule  atomicity.  The 
rule  says  that  a  protected  section  can  execute  in  one  sequential  step  even  though  the 
command  being  protected  may  take  more  than  one  step  to  terminate  successfully.  In  effect, 
this  captures  the  idea  of  disabling  scheduling  events  (interrupts)  while  a  protected  section 
executes  and  guarantees  that  at  most  one  thread  will  be  in  a  protected  section  at  any  time. 
For  this  reason,  we  prohibit  while  loops  in  protected  sections.  This  eliminates  any  risk  of  a 
thread  failing  to  terminate  while  in  a  protected  section,  causing  all  other  threads  to  freeze, 
and  also  simplifies  the  semantics.  Of  course  for  loops  can  be  protected  and,  in  fact,  must 
be  protected  in  some  situations  as  we  shall  see.  We  also  assume  that  protected  sections  are 
not  nested.  This  is  not  a  practical  limitation  and  it  simplifies  our  proofs. 
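The sequential rules can be read as a one-step interpreter. The sketch below is one way to write it, under assumptions of ours: commands and expressions are encoded as nested tuples, memories are Python dictionaries, and a terminated command is represented by None. The for rule treats a nonpositive bound as zero iterations, matching the informal description above, and protect runs its body to completion inside a single step.

```python
def ev(e, mu):
    """Value of expression e (a nested tuple) in memory mu (a dict)."""
    tag = e[0]
    if tag == "lit":
        return e[1]
    if tag == "var":
        return mu[e[1]]
    if tag == "add":
        return ev(e[1], mu) + ev(e[2], mu)
    if tag == "sub":
        return ev(e[1], mu) - ev(e[2], mu)
    raise ValueError(f"unknown expression {tag}")

def step(c, mu):
    """One sequential step. Returns (c', mu'); c' is None on termination."""
    tag = c[0]
    if tag == "skip":                                   # no-op
        return None, mu
    if tag == "assign":                                 # update
        return None, {**mu, c[1]: ev(c[2], mu)}
    if tag == "seq":                                    # sequence
        c1, mu1 = step(c[1], mu)
        return (c[2] if c1 is None else ("seq", c1, c[2])), mu1
    if tag == "if":                                     # branch
        return (c[2] if ev(c[1], mu) != 0 else c[3]), mu
    if tag == "while":                                  # loop
        return (("seq", c[2], c) if ev(c[1], mu) != 0 else None), mu
    if tag == "for":                                    # iterate
        n = ev(c[1], mu)
        if n <= 0:
            return None, mu
        return ("seq", c[2], ("for", ("lit", n - 1), c[2])), mu
    if tag == "protect":                                # atomicity
        body = c[1]
        while body is not None:      # terminates: no while loops inside protect
            body, mu = step(body, mu)
        return None, mu
    raise ValueError(f"unknown command {tag}")

def run(c, mu):
    """Iterate step until termination; return (number of steps, final memory)."""
    steps = 0
    while c is not None:
        c, mu = step(c, mu)
        steps += 1
    return steps, mu

loop = ("for", ("lit", 3), ("assign", "x", ("add", ("var", "x"), ("lit", 1))))
print(run(loop, {"x": 0}))               # unprotected: several sequential steps
print(run(("protect", loop), {"x": 0}))  # protected: a single sequential step
```

Both runs reach the same final memory; only the number of sequential steps, and hence the number of points at which another thread could be scheduled, differs.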

An interleaving semantics for multi-threaded programs is given by the GLOBAL rules in
Figure 1. As in [16], we take a concurrent program to be a set O of commands that run
concurrently. The set O is called the thread pool and it does not grow during execution. We
represent O as a mapping from thread identifiers (α, β, ...) to commands. We assume that
all threads share a single global memory μ, through which the threads can communicate. A
pair (O, μ), consisting of a thread pool and a shared memory, is called a global configuration.
The global rules let us prove judgments of the form

\[ (O, \mu) \stackrel{p}{\Longrightarrow} (O', \mu'). \]

This asserts that the probability of going from (O, μ) to (O', μ') is p. The first two GLOBAL
rules specify the global transitions that can be made by a nonempty thread pool. A
scheduling event occurs after a single sequential step of execution. This coupled with the
ATOMICITY rule ensures atomic execution of protected sections. Note that O − α denotes
the thread pool obtained by removing thread α from O, and O[α := c'] denotes the thread
pool obtained by updating the command associated with α to c'. Note that the rules
prescribe a uniform probability distribution for the scheduling of threads. (Actually, we could
use any fixed distribution for thread scheduling; we use a uniform distribution only for
simplicity.) The third GLOBAL rule, which deals with an empty thread pool, is introduced
to accommodate our concurrent program execution model in which a concurrent program
is represented by a discrete Markov chain [4]. The states of the Markov chain are global
configurations and the transition matrix is governed by ⇒.

³ The noninterference property we prove does not depend on atomicity here unless the time it takes to
evaluate an expression depends on the values of high variables.

(UPDATE)
\[ \frac{x \in \mathrm{dom}(\mu)}{(x := e,\ \mu) \longrightarrow \mu[x := \mu(e)]} \]

(SEQUENCE)
\[ \frac{(c_1,\ \mu) \longrightarrow \mu'}{(c_1; c_2,\ \mu) \longrightarrow (c_2,\ \mu')} \qquad
   \frac{(c_1,\ \mu) \longrightarrow (c_1',\ \mu')}{(c_1; c_2,\ \mu) \longrightarrow (c_1'; c_2,\ \mu')} \]

(NO-OP)
\[ (\mathbf{skip},\ \mu) \longrightarrow \mu \]

(BRANCH)
\[ \frac{\mu(e) \neq 0}{(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2,\ \mu) \longrightarrow (c_1,\ \mu)} \qquad
   \frac{\mu(e) = 0}{(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2,\ \mu) \longrightarrow (c_2,\ \mu)} \]

(LOOP)
\[ \frac{\mu(e) = 0}{(\mathbf{while}\ e\ \mathbf{do}\ c,\ \mu) \longrightarrow \mu} \qquad
   \frac{\mu(e) \neq 0}{(\mathbf{while}\ e\ \mathbf{do}\ c,\ \mu) \longrightarrow (c;\ \mathbf{while}\ e\ \mathbf{do}\ c,\ \mu)} \]

(ITERATE)
\[ \frac{\mu(e) \leq 0}{(\mathbf{for}\ e\ \mathbf{do}\ c,\ \mu) \longrightarrow \mu} \qquad
   \frac{\mu(e) > 0}{(\mathbf{for}\ e\ \mathbf{do}\ c,\ \mu) \longrightarrow (c;\ \mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c,\ \mu)} \]

(ATOMICITY)
\[ \frac{(c,\ \mu) \longrightarrow^{*} \mu'}{(\mathbf{protect}\ c,\ \mu) \longrightarrow \mu'} \]

(GLOBAL)
\[ \frac{O(\alpha) = c \qquad (c,\ \mu) \longrightarrow \mu' \qquad p = 1/|O|}{(O,\ \mu) \stackrel{p}{\Longrightarrow} (O - \alpha,\ \mu')} \]
\[ \frac{O(\alpha) = c \qquad (c,\ \mu) \longrightarrow (c',\ \mu') \qquad p = 1/|O|}{(O,\ \mu) \stackrel{p}{\Longrightarrow} (O[\alpha := c'],\ \mu')} \]
\[ (\{\,\},\ \mu) \stackrel{1}{\Longrightarrow} (\{\,\},\ \mu) \]

Figure 1: Sequential and concurrent transition semantics

3  The  type  system 

The types of the system are stratified into data and phrase types:

(data types)    τ ::= L | H
(phrase types)  ρ ::= τ | τ var | τ cmd

For simplicity, we limit the security classes here to just L (low) and H (high); it is possible
to generalize to an arbitrary partial order of security classes.

The rules of the type system are given in Figure 2. They extend the system of [16]
with rules for protect and for. The rules allow us to prove typing judgments of the
form γ ⊢ p : ρ as well as subtyping judgments of the form ρ1 ⊆ ρ2. Here γ denotes a
variable typing, mapping variables to phrase types of the form τ var. Note that guards of
conditionals and for loops may contain high variables, unlike the guards of while loops.

The  effect  of  these  typing  rules  is  to  impose  constraints  on  the  various  constructs  of  the 
language;  these  constraints  can  be  summarized  as  follows: 

•  In  an  assignment  x  :=  e ,  if  x  is  low,  then  e  must  contain  no  high  variables. 

•  The  guard  of  a  while  loop  must  contain  no  high  variables. 

• In a conditional if e then c else c', if the guard e contains any high variables, then
the branches c and c' must not contain any while loops or assignments to low
variables. A similar constraint applies to for loops.
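The constraints above can be checked mechanically. The sketch below is a hypothetical checker of ours, not the paper's algorithm: it assumes the two-level lattice L ⊆ H, encodes phrases as nested tuples, and is intended to compute, for a command c, the greatest τ such that c : τ cmd is derivable (returning None when c is ill typed). It relies on the fact that expression types may be raised and command types lowered by subtyping.

```python
L, H = 0, 1   # security levels, ordered L <= H

def expr_level(e, gamma):
    """Least type of expression e: H if it mentions a high variable, else L."""
    tag = e[0]
    if tag == "lit":
        return L
    if tag == "var":
        return gamma[e[1]]
    return max(expr_level(e[1], gamma), expr_level(e[2], gamma))  # binary operator

def cmd_level(c, gamma):
    """Greatest tau with c : tau cmd, or None if c is ill typed (a sketch)."""
    tag = c[0]
    if tag == "skip":
        return H
    if tag == "assign":
        _, x, e = c
        return gamma[x] if expr_level(e, gamma) <= gamma[x] else None
    if tag == "seq":
        t1, t2 = cmd_level(c[1], gamma), cmd_level(c[2], gamma)
        return None if None in (t1, t2) else min(t1, t2)
    if tag == "if":
        t1, t2 = cmd_level(c[2], gamma), cmd_level(c[3], gamma)
        if None in (t1, t2) or expr_level(c[1], gamma) > min(t1, t2):
            return None
        return min(t1, t2)
    if tag == "while":      # guard must be low; the loop itself is an L cmd
        body = cmd_level(c[2], gamma)
        return L if body is not None and expr_level(c[1], gamma) == L else None
    if tag == "for":
        body = cmd_level(c[2], gamma)
        return body if body is not None and expr_level(c[1], gamma) <= body else None
    if tag == "protect":
        return cmd_level(c[1], gamma)
    raise ValueError(f"unknown command {tag}")

def well_typed(c, gamma):
    return cmd_level(c, gamma) is not None

gamma = {"x": H, "y": L}
guard = ("eq", ("var", "x"), ("lit", 1))
alpha = ("seq", ("if", guard, ("seq", ("skip",), ("skip",)), ("skip",)),
         ("assign", "y", ("lit", 1)))
print(well_typed(alpha, gamma))                                        # high guard, high branches
print(well_typed(("assign", "y", ("var", "x")), gamma))                # direct flow from x to y
print(well_typed(("if", ("var", "x"), ("assign", "y", ("lit", 1)), ("skip",)), gamma))
```

The first command is accepted even though its guard is high, because neither branch assigns to a low variable or contains a while loop; the other two are rejected.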

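Definition 3.1 We say that p is well typed under γ if γ ⊢ p : ρ for some ρ. Also, O is well
typed under γ if O(α) is well typed under γ for every α ∈ dom(O).

(INT)
\[ \gamma \vdash n : L \]

(R-VAL)
\[ \frac{\gamma(x) = \tau\ \mathit{var}}{\gamma \vdash x : \tau} \]

(SUM)
\[ \frac{\gamma \vdash e_1 : \tau \qquad \gamma \vdash e_2 : \tau}{\gamma \vdash e_1 + e_2 : \tau} \]

(ASSIGN)
\[ \frac{\gamma(x) = \tau\ \mathit{var} \qquad \gamma \vdash e : \tau}{\gamma \vdash x := e : \tau\ \mathit{cmd}} \]

(COMPOSE)
\[ \frac{\gamma \vdash c_1 : \tau\ \mathit{cmd} \qquad \gamma \vdash c_2 : \tau\ \mathit{cmd}}{\gamma \vdash c_1; c_2 : \tau\ \mathit{cmd}} \]

(SKIP)
\[ \gamma \vdash \mathbf{skip} : H\ \mathit{cmd} \]

(IF)
\[ \frac{\gamma \vdash e : \tau \qquad \gamma \vdash c_1 : \tau\ \mathit{cmd} \qquad \gamma \vdash c_2 : \tau\ \mathit{cmd}}{\gamma \vdash \mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2 : \tau\ \mathit{cmd}} \]

(WHILE)
\[ \frac{\gamma \vdash e : L \qquad \gamma \vdash c : L\ \mathit{cmd}}{\gamma \vdash \mathbf{while}\ e\ \mathbf{do}\ c : L\ \mathit{cmd}} \]

(FOR)
\[ \frac{\gamma \vdash e : \tau \qquad \gamma \vdash c : \tau\ \mathit{cmd}}{\gamma \vdash \mathbf{for}\ e\ \mathbf{do}\ c : \tau\ \mathit{cmd}} \]

(PROTECT)
\[ \frac{\gamma \vdash c : \tau\ \mathit{cmd}}{\gamma \vdash \mathbf{protect}\ c : \tau\ \mathit{cmd}} \]

(BASE)
\[ L \subseteq H \]

(REFLEX)
\[ \rho \subseteq \rho \]

(CMD⁻)
\[ \frac{\tau_1 \subseteq \tau_2}{\tau_2\ \mathit{cmd} \subseteq \tau_1\ \mathit{cmd}} \]

(SUBTYPE)
\[ \frac{\gamma \vdash p : \rho_1 \qquad \rho_1 \subseteq \rho_2}{\gamma \vdash p : \rho_2} \]

Figure 2: Typing and subtyping rules

1) ({α := while l = 0 do skip, β := (l := 1)}, [l := 0])
2) ({α := while l = 0 do skip}, [l := 1])
3) ({α := skip; while l = 0 do skip, β := (l := 1)}, [l := 0])
4) ({α := skip; while l = 0 do skip}, [l := 1])
5) ({ }, [l := 1])

Figure 3: States of Markov chain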

4  Probabilistic  states 

Informally,  our  formulation  of  probabilistic  noninterference  is  a  sort  of  probabilistic  lock 
step  execution  statement.  Under  two  memories  that  may  differ  on  high  variables,  we  want  to 
know  that  the  probability  that  a  concurrent  program  can  reach  some  global  configuration 
under  one  of  the  memories  is  the  same  as  the  probability  that  it  reaches  an  equivalent 
configuration  under  the  other. 

A concurrent program O executing in a memory μ can be viewed as a discrete Markov
chain [4]. The states of the Markov chain are all the global configurations reachable from the
initial state (O, μ) under ⇒, and the transition matrix T (sometimes called the stochastic
matrix) is given by

\[ T((O_1, \mu_1), (O_2, \mu_2)) =
   \begin{cases}
     p & \text{if } (O_1, \mu_1) \stackrel{p}{\Longrightarrow} (O_2, \mu_2) \\
     0 & \text{otherwise}
   \end{cases} \]

For example, consider the following program:

    O = {α := while l = 0 do skip, β := (l := 1)}

Starting with memory [l := 0], the program can get into at most five different configurations,
and so its Markov chain has five states, given in Figure 3. For instance, starting in state
1 we might run thread α for a step, taking us to state 3. (This follows from the second
GLOBAL rule and the second LOOP rule of Figure 1.) Alternatively, from state 1 we might
run thread β for a step, taking us to state 2. (This follows from the first GLOBAL rule and
the UPDATE rule of Figure 1.) Furthermore, the GLOBAL rules specify that the probability
of each of these transitions is 1/2 since there are two threads in the thread pool. No other
transitions are possible from state 1.

In this way, we can determine the probability of going from each of the five states to any
other state. These probabilities are collected in the transition matrix T given in Figure 4.
Note that from state 5 we go to state 5 with probability 1 because of the third GLOBAL rule
of Figure 1. Since no other state is reachable from state 5, it is called an absorbing state [4].

The set of Markov states may be countably infinite; a simple example is a nonterminating
loop that increments a variable. In this case, the transition matrix is also countably
infinite. In general, if T is a transition matrix and T((O, μ), (O', μ')) > 0, for some global
configurations (O, μ) and (O', μ'), then either O is nonempty and T((O, μ), (O', μ')) = 1/|O|,
or else O and O' are empty, μ = μ', and T((O, μ), (O', μ')) = 1.

         1     2     3     4     5
    1    0    1/2   1/2    0     0
    2    0     0     0     0     1
    3   1/2    0     0    1/2    0
    4    0     1     0     0     0
    5    0     0     0     0     1

Figure 4: Transition matrix


Kozen gives a denotational semantics of probabilistic programs whereby a program
denotes a mapping from one probability distribution to another [11]. The idea of transforming
distributions is also useful in an operational setting such as ours. Using the Markov chain,
we model the execution of a concurrent program deterministically as a sequence of
probabilistic states:

Definition 4.1 A probabilistic state is a probability measure on the set of global
configurations.

A probabilistic state can be represented as a row vector whose components must sum
to 1. So if T is a transition matrix and s is a probabilistic state, then the next probabilistic
state in the sequence of such states modeling a concurrent computation is simply the
vector-matrix product sT. For instance, the initial probabilistic state for the program O in our
preceding example is (1 0 0 0 0). It indicates that the Markov chain begins in state 1 with
certainty. The next state is given by taking the product of this state with the transition
matrix of Figure 4, giving (0 1/2 1/2 0 0). This state indicates the Markov chain can be
in states 2 and 3, each with a probability of 1/2. Multiplying this vector by T, we get the
third probabilistic state, (1/4 0 0 1/4 1/2); we can determine the complete execution
in this way. The first five probabilistic states in the sequence are depicted in Figure 5.
The fifth probabilistic state tells us that the probability that O terminates under memory
[l := 0] in at most four steps is 7/8.
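The vector-matrix products described above are easy to reproduce with exact rational arithmetic. The following sketch uses the transition matrix of Figure 4; the helper name advance is ours.

```python
from fractions import Fraction as F

# Transition matrix T of Figure 4: entry T[i][j] is the probability of
# moving from state i+1 to state j+1.
T = [
    [F(0),    F(1, 2), F(1, 2), F(0),    F(0)],
    [F(0),    F(0),    F(0),    F(0),    F(1)],
    [F(1, 2), F(0),    F(0),    F(1, 2), F(0)],
    [F(0),    F(1),    F(0),    F(0),    F(0)],
    [F(0),    F(0),    F(0),    F(0),    F(1)],
]

def advance(s, T):
    """Return the row vector s T, i.e. the next probabilistic state."""
    return [sum(s[i] * T[i][j] for i in range(len(s))) for j in range(len(T[0]))]

s = [F(1), F(0), F(0), F(0), F(0)]   # begin in state 1 with certainty
states = [s]
for _ in range(4):
    s = advance(s, T)
    states.append(s)

for k, state in enumerate(states, start=1):
    print(k, [str(p) for p in state])
```

The last component of the fifth vector is the probability of having reached the absorbing state 5, that is, of having terminated within four steps.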

We remark that (O, [l := 0]) is an example of a concurrent program that is
probabilistically total, since it halts with probability 1. But it is not nondeterministically total,
because it has an infinite computation path.

Note that although there may be infinitely many states in the Markov chains
corresponding to our programs, the probabilistic states that arise in our program executions
will assign nonzero probability to only finitely many of them. This is because we begin
execution in a single global configuration (O, μ), and we branch by at most a factor of k
at each step, where k is the number of threads in O. If, however, we were to extend our
language with a random number generator, which returns an arbitrary integer with respect
to some probability distribution, then we would have to consider probabilistic states that
give nonzero probabilities to an infinite number of global configurations.

With probabilistic states, we can now see how probability distributions can be sensitive
to initial values of high variables, even for programs that have types in the system of
Figure 2. Consider the example from Section 1, where c is instantiated to skip:

    α := (if x = 1 then skip; skip); y := 1,
    β := (skip; y := 0)

{ ({α := while l = 0 do skip, β := (l := 1)}, [l := 0]) : 1 }
        ↓
{ ({α := while l = 0 do skip}, [l := 1]) : 1/2,
  ({α := skip; while l = 0 do skip, β := (l := 1)}, [l := 0]) : 1/2 }
        ↓
{ ({ }, [l := 1]) : 1/2,
  ({α := while l = 0 do skip, β := (l := 1)}, [l := 0]) : 1/4,
  ({α := skip; while l = 0 do skip}, [l := 1]) : 1/4 }
        ↓
{ ({ }, [l := 1]) : 1/2,
  ({α := skip; while l = 0 do skip, β := (l := 1)}, [l := 0]) : 1/8,
  ({α := while l = 0 do skip}, [l := 1]) : 3/8 }
        ↓
{ ({ }, [l := 1]) : 7/8,
  ({α := while l = 0 do skip, β := (l := 1)}, [l := 0]) : 1/16,
  ({α := skip; while l = 0 do skip}, [l := 1]) : 1/16 }

Figure 5: A probabilistic state sequence


Each  thread  is  well  typed.  We  give  two  sequences  of  state  transitions,  assuming  the  obvious 
transition  semantics  for  if  e  then  c.  One  begins  with  x  equal  to  0  (Figure  6)  and  the  other 
with  x  equal  to  1  (Figure  7).  Notice  the  change  in  distribution  for  the  final  values  of  y  when 
the  initial  value  of  the  high  variable  x  changes.  For  instance,  the  probability  that  y  has  final 
value  1  when  x  equals  1  is  13/16,  and  falls  to  1/2  when  x  equals  0.  What  is  going  on  here 
is  that  the  initial  value  of  x  affects  the  amount  of  time  required  to  execute  the  conditional; 
this  in  turn  affects  the  likely  order  in  which  the  two  assignments  to  y  are  executed.  Now 
suppose  that  we  protect  the  conditional  in  this  example.  Then  the  conditional  (in  effect) 
executes  in  one  step,  regardless  of  the  value  of  x ,  and  so  the  sequence  of  transitions  for 
x  =  0  is  equivalent,  state  by  state,  to  the  sequence  of  transitions  for  x  =  1  (Figures  8  and 
9). 
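The distributions in this example can be recomputed by exploring the Markov chain directly. The sketch below does so under assumptions of ours: commands are nested tuples, a one-armed conditional ("ifthen") terminates in one step when its guard is false (the transition semantics that the figures appear to use), and only the constructs needed for this example are implemented.

```python
from fractions import Fraction
from collections import defaultdict

def ev(e, mu):
    tag = e[0]
    if tag == "lit":
        return e[1]
    if tag == "var":
        return mu[e[1]]
    if tag == "eq":
        return int(ev(e[1], mu) == ev(e[2], mu))
    raise ValueError(tag)

def step(c, mu):
    """One sequential step; the returned command is None on termination."""
    tag = c[0]
    if tag == "skip":
        return None, mu
    if tag == "assign":
        return None, {**mu, c[1]: ev(c[2], mu)}
    if tag == "seq":
        c1, mu1 = step(c[1], mu)
        return (c[2] if c1 is None else ("seq", c1, c[2])), mu1
    if tag == "ifthen":                  # assumed: false guard ends in one step
        return (c[2] if ev(c[1], mu) != 0 else None), mu
    if tag == "protect":                 # run the body to completion atomically
        body = c[1]
        while body is not None:
            body, mu = step(body, mu)
        return None, mu
    raise ValueError(tag)

def next_state(state):
    """Map a probabilistic state (dict: configuration -> probability) to its successor."""
    nxt = defaultdict(Fraction)
    for (pool, mem), p in state.items():
        if not pool:                     # empty pool: stay put with probability 1
            nxt[(pool, mem)] += p
            continue
        for tid, c in pool:              # uniform scheduling over live threads
            c1, mu1 = step(c, dict(mem))
            rest = dict(pool)
            if c1 is None:
                del rest[tid]
            else:
                rest[tid] = c1
            key = (tuple(sorted(rest.items())), tuple(sorted(mu1.items())))
            nxt[key] += p / len(pool)
    return dict(nxt)

def final_y(alpha, x, steps=10):
    """Distribution of the final value of y, starting from [x := x, y := 0]."""
    beta = ("seq", ("skip",), ("assign", "y", ("lit", 0)))
    state = {((("alpha", alpha), ("beta", beta)), (("x", x), ("y", 0))): Fraction(1)}
    for _ in range(steps):
        state = next_state(state)
    dist = defaultdict(Fraction)
    for (pool, mem), p in state.items():
        assert not pool                  # every thread has terminated by now
        dist[dict(mem)["y"]] += p
    return dict(dist)

cond = ("ifthen", ("eq", ("var", "x"), ("lit", 1)), ("seq", ("skip",), ("skip",)))
unprotected = ("seq", cond, ("assign", "y", ("lit", 1)))
protected = ("seq", ("protect", cond), ("assign", "y", ("lit", 1)))

for name, alpha in (("unprotected", unprotected), ("protected", protected)):
    for x in (0, 1):
        print(name, "x =", x, {y: str(p) for y, p in final_y(alpha, x).items()})
```

Without protect the distribution of y depends on x; with protect the two distributions coincide.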

5  Probabilistic  noninterference 

Now we establish the main result, that a well-typed concurrent program is probabilistically
noninterfering if every total command with a guard containing a high variable executes
atomically. But first we need some properties of our deterministic thread language. The
key property needed in the noninterference proof is a lockstep execution property for
well-typed threads executed under equivalent memories (Lemma 5.7). This property in turn
depends on other properties of well-typed threads in our system, specifically, Simple Security,
Confinement and Mutual Termination. Together these properties assert that well-typed
threads respect the privacy of high variables in the deterministic language.

{ ({α := (if x = 1 then skip; skip); y := 1, β := (skip; y := 0)}, [x := 0, y := 0]) : 1 }
        ↓
{ ({α := (if x = 1 then skip; skip); y := 1, β := y := 0}, [x := 0, y := 0]) : 1/2,
  ({α := y := 1, β := (skip; y := 0)}, [x := 0, y := 0]) : 1/2 }
        ↓
{ ({α := (if x = 1 then skip; skip); y := 1}, [x := 0, y := 0]) : 1/4,
  ({α := y := 1, β := y := 0}, [x := 0, y := 0]) : 1/2,
  ({β := (skip; y := 0)}, [x := 0, y := 1]) : 1/4 }
        ↓
{ ({α := y := 1}, [x := 0, y := 0]) : 1/2,
  ({β := y := 0}, [x := 0, y := 1]) : 1/2 }
        ↓
{ ({ }, [x := 0, y := 1]) : 1/2,
  ({ }, [x := 0, y := 0]) : 1/2 }

Figure 6: Probabilistic state sequence when x = 0

{ ({α := (if x = 1 then skip; skip); y := 1, β := (skip; y := 0)}, [x := 1, y := 0]) : 1 }
        ↓
{ ({α := (if x = 1 then skip; skip); y := 1, β := y := 0}, [x := 1, y := 0]) : 1/2,
  ({α := (skip; skip); y := 1, β := (skip; y := 0)}, [x := 1, y := 0]) : 1/2 }
        ↓
{ ({α := (if x = 1 then skip; skip); y := 1}, [x := 1, y := 0]) : 1/4,
  ({α := (skip; skip); y := 1, β := y := 0}, [x := 1, y := 0]) : 1/2,
  ({α := skip; y := 1, β := (skip; y := 0)}, [x := 1, y := 0]) : 1/4 }
        ↓
{ ({α := (skip; skip); y := 1}, [x := 1, y := 0]) : 1/2,
  ({α := skip; y := 1, β := y := 0}, [x := 1, y := 0]) : 3/8,
  ({α := y := 1, β := (skip; y := 0)}, [x := 1, y := 0]) : 1/8 }
        ↓
{ ({α := skip; y := 1}, [x := 1, y := 0]) : 11/16,
  ({α := y := 1, β := y := 0}, [x := 1, y := 0]) : 1/4,
  ({β := (skip; y := 0)}, [x := 1, y := 1]) : 1/16 }
        ↓
{ ({α := y := 1}, [x := 1, y := 0]) : 13/16,
  ({β := y := 0}, [x := 1, y := 1]) : 3/16 }
        ↓
{ ({ }, [x := 1, y := 1]) : 13/16,
  ({ }, [x := 1, y := 0]) : 3/16 }

Figure 7: Probabilistic state sequence when x = 1

{ ({α := (protect if x = 1 then skip; skip); y := 1, β := (skip; y := 0)}, [x := 0, y := 0]) : 1 }
        ↓
{ ({α := (protect if x = 1 then skip; skip); y := 1, β := y := 0}, [x := 0, y := 0]) : 1/2,
  ({α := y := 1, β := (skip; y := 0)}, [x := 0, y := 0]) : 1/2 }
        ↓
{ ({α := (protect if x = 1 then skip; skip); y := 1}, [x := 0, y := 0]) : 1/4,
  ({α := y := 1, β := y := 0}, [x := 0, y := 0]) : 1/2,
  ({β := (skip; y := 0)}, [x := 0, y := 1]) : 1/4 }
        ↓
{ ({α := y := 1}, [x := 0, y := 0]) : 1/2,
  ({β := y := 0}, [x := 0, y := 1]) : 1/2 }
        ↓
{ ({ }, [x := 0, y := 1]) : 1/2,
  ({ }, [x := 0, y := 0]) : 1/2 }

Figure 8: Probabilistic state sequence when x = 0

{ ({α := (protect if x = 1 then skip; skip); y := 1, β := (skip; y := 0)}, [x := 1, y := 0]) : 1 }
        ↓
{ ({α := (protect if x = 1 then skip; skip); y := 1, β := y := 0}, [x := 1, y := 0]) : 1/2,
  ({α := y := 1, β := (skip; y := 0)}, [x := 1, y := 0]) : 1/2 }
        ↓
{ ({α := (protect if x = 1 then skip; skip); y := 1}, [x := 1, y := 0]) : 1/4,
  ({α := y := 1, β := y := 0}, [x := 1, y := 0]) : 1/2,
  ({β := (skip; y := 0)}, [x := 1, y := 1]) : 1/4 }
        ↓
{ ({α := y := 1}, [x := 1, y := 0]) : 1/2,
  ({β := y := 0}, [x := 1, y := 1]) : 1/2 }
        ↓
{ ({ }, [x := 1, y := 1]) : 1/2,
  ({ }, [x := 1, y := 0]) : 1/2 }

Figure 9: Probabilistic state sequence when x = 1

First we need a notion of memory equivalence, which basically requires agreement on
the contents of low variables:

Definition 5.1 Memories μ and ν are equivalent with respect to variable typing γ, written
μ ∼γ ν, if dom(μ) = dom(ν) = dom(γ) and μ(x) = ν(x) for all x such that γ(x) = L var.
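Definition 5.1 translates directly into a predicate on finite maps. In the sketch below, memories and the variable typing are Python dictionaries, and the typing maps each variable to the string "L" or "H"; this encoding is an assumption of ours.

```python
def equivalent(mu, nu, gamma):
    """mu ~gamma nu: both memories have exactly the domain of gamma
    and agree on every variable that gamma classifies as low."""
    if not (mu.keys() == nu.keys() == gamma.keys()):
        return False
    return all(mu[x] == nu[x] for x in gamma if gamma[x] == "L")

gamma = {"x": "H", "y": "L"}
print(equivalent({"x": 0, "y": 0}, {"x": 1, "y": 0}, gamma))  # differ only on high x
print(equivalent({"x": 0, "y": 0}, {"x": 0, "y": 1}, gamma))  # differ on low y
```

For instance, the initial memories [x := 0, y := 0] and [x := 1, y := 0] used in the example of Section 4 are equivalent when x is high and y is low.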

The proofs of Simple Security and Confinement (Lemmas 5.1 and 5.5) are complicated
somewhat by subtyping. We shall assume, without loss of generality, that all typing
derivations end with a single (perhaps trivial) application of the subsumption rule SUBTYPE.

Lemma 5.1 (Simple Security) If γ ⊢ e : L, then γ(x) = L var for every variable x in e.

Proof. By induction on the structure of e. Since H ⊈ L, the derivation of γ ⊢ e : L ends
with a trivial application of rule SUBTYPE.

1. Case x. By rule R-VAL, we have γ(x) = L var.

2. Case n. The result holds vacuously.

3. Case e1 + e2. By rule SUM, we have γ ⊢ e1 : L and γ ⊢ e2 : L. By induction,
γ(x) = L var for every variable x in e1 and in e2. The remaining binary operators
are handled similarly.

□

Next we consider Confinement. Its proof depends on the next three lemmas. The first
two treat the behavior of sequential composition,⁴ and the third is a lemma about the
termination of for loops.

Lemma 5.2 If (c1; c2, μ) →^j μ', then there exist k and μ'' such that 0 < k < j, (c1, μ) →^k
μ'', and (c2, μ'') →^(j−k) μ'.

Proof. By induction on j. If the derivation begins with an application of the first SEQUENCE
rule, then there exists μ'' such that (c1, μ) → μ'' and (c1; c2, μ) → (c2, μ'') →^(j−1) μ'. So
we can let k = 1. And, since j − 1 ≥ 1, we have k < j.

If the derivation begins with an application of the second SEQUENCE rule, then there
exist c1' and μ1 such that (c1, μ) → (c1', μ1) and (c1; c2, μ) → (c1'; c2, μ1) →^(j−1) μ'.
By induction, there exist k and μ'' such that 0 < k < j − 1, (c1', μ1) →^k μ'', and
(c2, μ'') →^(j−1−k) μ'. Hence (c1, μ) →^(k+1) μ'' and (c2, μ'') →^(j−(k+1)) μ'. And 0 < k + 1 < j.
□

Lemma 5.3 If (c1, μ) →^j μ' and (c2, μ') →^k μ'', then (c1; c2, μ) →^(j+k) μ''.

Proof. By induction on j. If j = 1 then by the first SEQUENCE rule, (c1; c2, μ) →
(c2, μ') →^k μ''. Hence (c1; c2, μ) →^(1+k) μ''.

If j > 1 then there exist c1' and μ1 such that (c1, μ) → (c1', μ1) →^(j−1) μ'. By induction,
(c1'; c2, μ1) →^(j−1+k) μ''. And, by the second SEQUENCE rule, (c1; c2, μ) → (c1'; c2, μ1).
Hence (c1; c2, μ) →^(j+k) μ''. □

⁴ Lemmas 5.2 and 5.3 are necessitated by our use of a small-step transition semantics for threads. They
would be unnecessary in an operational semantics that talks about complete evaluations as in, say, a natural
semantics. But a natural semantics does not meet our needs, because we model concurrent program
execution as a sequence of probabilistic states where configurations represent intermediate steps of an execution.

Definition  5.2  A  command  c  is  total  under  7  if  for  all  p  such  that  dom(n)  =  dom( 7), 
there  exists  p'  such  that  (c,p)  — >*  p' . 

Lemma 5.4 Suppose $\mu(e)$ is defined and $dom(\mu) = dom(\gamma)$. If $c$ is total under $\gamma$ then
there exists $\mu'$ such that $(\mathbf{for}\ e\ \mathbf{do}\ c, \mu) \rightarrow^{*} \mu'$.

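Proof. Induction on $\mu(e)$. If $\mu(e) \leq 0$ then $(\mathbf{for}\ e\ \mathbf{do}\ c, \mu) \rightarrow \mu$, so let $\mu' = \mu$. Now
suppose $\mu(e) > 0$. Since $c$ is total under $\gamma$ and $dom(\mu) = dom(\gamma)$, there exists $\mu''$
such that $(c, \mu) \rightarrow^{*} \mu''$. Further, $dom(\mu'') = dom(\mu) = dom(\gamma)$ and $\mu''(\mu(e) - 1)$ is
trivially defined since $\mu(e) - 1$ is an integer. So by induction, there exists $\mu'$ such that
$(\mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c, \mu'') \rightarrow^{*} \mu'$. Then
$(c; \mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c, \mu) \rightarrow^{*} \mu'$ by Lemma 5.3. And
$(\mathbf{for}\ e\ \mathbf{do}\ c, \mu) \rightarrow (c; \mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c, \mu)$ by the second ITERATE rule since $\mu(e) > 0$.
Hence $(\mathbf{for}\ e\ \mathbf{do}\ c, \mu) \rightarrow^{*} \mu'$. □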
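
The small-step rules used in Lemmas 5.2 to 5.4 can be animated with a short interpreter. The following Python sketch is an illustration only and is not part of the formal development. The tuple encoding of commands, the representation of expressions as functions from memories to integers, and the names step and run are assumptions of the sketch, and the rule shapes are inferred from the proofs above rather than copied from Figure 1.

```python
def step(c, mu):
    """One small step. Returns (c', mu'), with c' = None when the command is finished."""
    tag = c[0]
    if tag == "skip":                       # NO-OP
        return None, mu
    if tag == "assign":                     # UPDATE: mu[x := mu(e)]
        _, x, e = c
        return None, {**mu, x: e(mu)}
    if tag == "seq":                        # the two SEQUENCE rules
        _, c1, c2 = c
        c1_next, mu1 = step(c1, mu)
        return (c2 if c1_next is None else ("seq", c1_next, c2)), mu1
    if tag == "if":                         # BRANCH: nonzero guard selects c1
        _, e, c1, c2 = c
        return (c1 if e(mu) != 0 else c2), mu
    if tag == "while":
        _, e, body = c
        return (None if e(mu) == 0 else ("seq", body, c)), mu
    if tag == "for":                        # the two ITERATE rules
        _, e, body = c
        n = e(mu)
        if n <= 0:
            return None, mu
        return ("seq", body, ("for", (lambda _m, n=n: n - 1), body)), mu
    if tag == "protect":                    # ATOMICITY: the whole body is one step
        _, body = c
        return None, run(body, mu)[0]
    raise ValueError(f"unknown command: {tag}")


def run(c, mu):
    """Run c to completion. Returns (final memory, number of small steps)."""
    steps = 0
    while c is not None:
        c, mu = step(c, mu)
        steps += 1
    return mu, steps


# Lemma 5.3 on one instance: step counts of c1 and c2 add up for c1; c2.
inc = ("assign", "x", lambda m: m["x"] + 1)
loop = ("for", lambda m: m["n"], inc)
mu = {"x": 0, "n": 3}
mu1, j = run(loop, mu)
mu2, k = run(inc, mu1)
mu3, total = run(("seq", loop, inc), mu)
assert total == j + k and mu3 == mu2
# A protected command always takes exactly one step.
assert run(("protect", loop), mu) == (mu1, 1)
```

Note that run does not terminate on a diverging while loop; for loops always terminate because the bound is evaluated once and then counts down, which is the content of Lemma 5.4.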

Lemma 5.5 (Confinement) If $\gamma \vdash c : H\ cmd$, then $\gamma(x) = H\ var$ for every variable $x$
assigned to in $c$, and $c$ is total under $\gamma$.

Proof. By induction on the structure of $c$. Since $L\ cmd \not\subseteq H\ cmd$, the derivation of
$\gamma \vdash c : H\ cmd$ ends with a trivial application of rule SUBTYPE.

1. Case $x := e$. By rule ASSIGN, $\gamma(x) = H\ var$. If $dom(\mu) = dom(\gamma)$ then $x \in dom(\mu)$,
and $\mu(e)$ is defined since $e$ is well typed under $\gamma$ by rule ASSIGN. So by rule UPDATE,
we have $(x := e, \mu) \rightarrow \mu[x := \mu(e)]$.

2.  Case  skip.  The  result  follows  immediately  from  rule  NO-OP. 

3. Case $c_1; c_2$. By rule COMPOSE, we have $\gamma \vdash c_1 : H\ cmd$ and $\gamma \vdash c_2 : H\ cmd$. By
induction, we have $\gamma(x) = H\ var$ for every variable $x$ assigned to in $c_1$ and in $c_2$, and
$c_1$ and $c_2$ are total under $\gamma$. So if $dom(\mu) = dom(\gamma)$ then there exists $\mu''$ such that
$(c_1, \mu) \rightarrow^{*} \mu''$. Now $dom(\mu'') = dom(\mu)$, so there exists $\mu'$ such that $(c_2, \mu'') \rightarrow^{*} \mu'$.
Hence $(c_1; c_2, \mu) \rightarrow^{*} \mu'$ by Lemma 5.3.

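4. Case $\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2$. By rule IF, we have $\gamma \vdash c_1 : H\ cmd$ and $\gamma \vdash c_2 : H\ cmd$.
By induction, we have $\gamma(x) = H\ var$ for every variable $x$ assigned to in $c_1$ and in $c_2$,
and $c_1$ and $c_2$ are total under $\gamma$. If $dom(\mu) = dom(\gamma)$ then $\mu(e)$ is defined since $e$
is well typed under $\gamma$. If $\mu(e) \neq 0$ then $(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \mu) \rightarrow (c_1, \mu)$ by rule
BRANCH. And since $c_1$ is total under $\gamma$, there exists $\mu'$ such that $(c_1, \mu) \rightarrow^{*} \mu'$. So
$(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \mu) \rightarrow^{*} \mu'$. The case when $\mu(e) = 0$ is similar.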

5. Case $\mathbf{for}\ e\ \mathbf{do}\ c_1$. By rule FOR, we have $\gamma \vdash c_1 : H\ cmd$. By induction, $\gamma(x) = H\ var$
for every variable $x$ assigned to in $c_1$, and $c_1$ is total under $\gamma$. If $dom(\mu) = dom(\gamma)$
then $\mu(e)$ is defined since $e$ is well typed under $\gamma$. Then there exists $\mu'$ such that
$(\mathbf{for}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow^{*} \mu'$ by Lemma 5.4.

6. Case $\mathbf{protect}\ c_1$. By rule PROTECT, we have $\gamma \vdash c_1 : H\ cmd$. By induction, $\gamma(x) = H\ var$
for every variable $x$ assigned to in $c_1$, and $c_1$ is total under $\gamma$. So if $dom(\mu) = dom(\gamma)$
then there exists $\mu'$ such that $(c_1, \mu) \rightarrow^{*} \mu'$. Hence $(\mathbf{protect}\ c_1, \mu) \rightarrow \mu'$
by rule ATOMICITY.

□ 

The Confinement Lemma says that any thread that can be given type $H\ cmd$ under a
typing $\gamma$ terminates successfully when run in a memory $\mu$ having the same domain as $\gamma$
and does not change the contents of any low variables of $\mu$.

Now we establish the Mutual Termination property. It states that it is impossible for
a well-typed thread to terminate under one memory and run forever under an equivalent
memory.

Lemma 5.6 (Mutual Termination) Suppose $c$ is well typed under $\gamma$ and protect free,
and that $\mu \sim_\gamma \nu$. If $(c, \mu) \rightarrow^{*} \mu'$ then there is a $\nu'$ such that $(c, \nu) \rightarrow^{*} \nu'$ and
$\mu' \sim_\gamma \nu'$.

Proof. By induction on the length of the derivation of $(c, \mu) \rightarrow^{*} \mu'$. We consider the
different forms of $c$:

1. Case $x := e$. Since $c$ is well typed, we have $x \in dom(\gamma)$. Since $dom(\mu) = dom(\nu) =
dom(\gamma)$, we have $(c, \mu) \rightarrow \mu[x := \mu(e)]$ and $(c, \nu) \rightarrow \nu[x := \nu(e)]$. If $\gamma(x) = L\ var$
then by rule ASSIGN, we have $\gamma \vdash e : L$. So by Simple Security, $\gamma(y) = L\ var$ for every
variable $y$ in $e$. Hence $\mu(e) = \nu(e)$, and so $\mu[x := \mu(e)] \sim_\gamma \nu[x := \nu(e)]$. If, instead,
$\gamma(x) = H\ var$ then trivially $\mu[x := \mu(e)] \sim_\gamma \nu[x := \nu(e)]$.

2.  Case  skip.  The  result  follows  immediately  from  rule  NO-OP. 

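3. Case $c_1; c_2$. If $(c_1; c_2, \mu) \rightarrow^{j} \mu'$ then by Lemma 5.2 there exist $k$ and $\mu''$ such that
$0 < k < j$, $(c_1, \mu) \rightarrow^{k} \mu''$ and $(c_2, \mu'') \rightarrow^{j-k} \mu'$. By induction, there exists $\nu''$ such
that $(c_1, \nu) \rightarrow^{*} \nu''$ and $\mu'' \sim_\gamma \nu''$. So by induction again, there exists $\nu'$ such that
$(c_2, \nu'') \rightarrow^{*} \nu'$ and $\mu' \sim_\gamma \nu'$. Finally, $(c_1; c_2, \nu) \rightarrow^{*} \nu'$ by Lemma 5.3.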

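4. Case $\mathbf{while}\ e\ \mathbf{do}\ c_1$. Since $\gamma \vdash e : L$, we know by Simple Security that $\mu(e) = \nu(e)$.
Suppose $\mu(e) = 0$. Then $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow \mu$ and also $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow \nu$,
since $\mu(e) = \nu(e)$.

If $\mu(e) \neq 0$ then $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow (c_1; \mathbf{while}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow^{*} \mu'$. By induction,
there exists $\nu'$ such that $(c_1; \mathbf{while}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow^{*} \nu'$ and $\mu' \sim_\gamma \nu'$. And since $\nu(e) \neq 0$,
$(\mathbf{while}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow (c_1; \mathbf{while}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow^{*} \nu'$.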

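5. Case $\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2$. If $\gamma \vdash e : L$ then $\mu(e) = \nu(e)$ by Simple Security. If $\mu(e) \neq 0$
then $(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \mu) \rightarrow (c_1, \mu) \rightarrow^{*} \mu'$. By induction, there exists $\nu'$ such
that $(c_1, \nu) \rightarrow^{*} \nu'$ and $\mu' \sim_\gamma \nu'$. Since $\nu(e) \neq 0$, we have
$(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \nu) \rightarrow (c_1, \nu) \rightarrow^{*} \nu'$. The case when $\mu(e) = 0$ is similar.

If instead $\gamma \nvdash e : L$ then by rule IF, and the fact that $c$ is well typed, we have
$\gamma \vdash \mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2 : H\ cmd$. Then by Confinement, $\mu' \sim_\gamma \mu$ and there exists $\nu'$
such that $(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \nu) \rightarrow^{*} \nu'$ and $\nu' \sim_\gamma \nu$. So $\mu' \sim_\gamma \nu'$.

6. Case $\mathbf{for}\ e\ \mathbf{do}\ c_1$. If $\gamma \vdash e : L$ then $\mu(e) = \nu(e)$ by Simple Security. So if $\mu(e) \leq 0$
then $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow \mu$ and, since $\mu(e) = \nu(e)$, $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow \nu$. If instead
$\mu(e) > 0$, then $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow (c_1; \mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c_1, \mu) \rightarrow^{*} \mu'$. By induction,
there exists $\nu'$ such that $(c_1; \mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c_1, \nu) \rightarrow^{*} \nu'$ and $\mu' \sim_\gamma \nu'$. And since
$\mu(e) = \nu(e)$, $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow (c_1; \mathbf{for}\ \nu(e) - 1\ \mathbf{do}\ c_1, \nu) \rightarrow^{*} \nu'$.

If instead $\gamma \nvdash e : L$ then since $c$ is well typed, we have $\gamma \vdash \mathbf{for}\ e\ \mathbf{do}\ c_1 : H\ cmd$
by rule FOR. By Confinement, there exists $\nu'$ such that $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow^{*} \nu'$ and
$\nu' \sim_\gamma \nu$. Also, $\mu' \sim_\gamma \mu$ by Confinement, so $\mu' \sim_\gamma \nu'$.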

□ 

Definition 5.3 A command is protected under $\gamma$ if every conditional and for loop is enclosed
by a protect whenever the guard contains a variable $x$ such that $\gamma(x) = H\ var$.
Also, $O$ is protected under $\gamma$ if $O(\alpha)$ is protected under $\gamma$ for every $\alpha \in dom(O)$.

Now  we  can  show  that  if  we  execute  a  well-typed,  protected  command  c  in  two  equivalent 
memories  then  the  two  executions  proceed  in  lock  step. 

Lemma 5.7 (Lock Step Execution) Suppose $c$ is well typed under $\gamma$ and protected, and
that $\mu \sim_\gamma \nu$. If $(c, \mu) \rightarrow (c', \mu')$, then there exists $\nu'$ such that $(c, \nu) \rightarrow (c', \nu')$ and $\mu' \sim_\gamma \nu'$.
And if $(c, \mu) \rightarrow \mu'$, then there exists $\nu'$ such that $(c, \nu) \rightarrow \nu'$ and $\mu' \sim_\gamma \nu'$.

Proof.  By  induction  on  the  structure  of  c. 

1. Case $x := e$. Since $c$ is well typed, $x \in dom(\gamma)$. Since $dom(\mu) = dom(\nu) = dom(\gamma)$,
we have $(c, \mu) \rightarrow \mu[x := \mu(e)]$ and $(c, \nu) \rightarrow \nu[x := \nu(e)]$. If $\gamma(x) = L\ var$ then
by rule ASSIGN, we have $\gamma \vdash e : L$. So by Simple Security, $\gamma(y) = L\ var$ for every
variable $y$ in $e$. Hence $\mu(e) = \nu(e)$, and so $\mu[x := \mu(e)] \sim_\gamma \nu[x := \nu(e)]$. If, instead,
$\gamma(x) = H\ var$ then trivially $\mu[x := \mu(e)] \sim_\gamma \nu[x := \nu(e)]$.

2.  Case  skip.  The  result  follows  immediately  from  rule  NO-OP. 

3. Case $c_1; c_2$. If $(c_1; c_2, \mu) \rightarrow (c_2, \mu')$ then by the first SEQUENCE rule, we have
$(c_1, \mu) \rightarrow \mu'$. By induction, there exists $\nu'$ such that $(c_1, \nu) \rightarrow \nu'$ and $\mu' \sim_\gamma \nu'$.
And therefore $(c_1; c_2, \nu) \rightarrow (c_2, \nu')$ by the first SEQUENCE rule.

If, instead, $(c_1; c_2, \mu) \rightarrow (c_1'; c_2, \mu')$ then $(c_1, \mu) \rightarrow (c_1', \mu')$ by the second SEQUENCE
rule. By induction, there exists $\nu'$ such that $(c_1, \nu) \rightarrow (c_1', \nu')$ and $\mu' \sim_\gamma \nu'$. Hence
$(c_1; c_2, \nu) \rightarrow (c_1'; c_2, \nu')$ by the second SEQUENCE rule.

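4. Case $\mathbf{while}\ e\ \mathbf{do}\ c_1$. Since $\gamma \vdash e : L$, we know by Simple Security that $\mu(e) = \nu(e)$.
Suppose $\mu(e) = 0$. Then $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow \mu$ and so $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow \nu$
since $\mu(e) = \nu(e)$. If $\mu(e) \neq 0$ then $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow (c_1; \mathbf{while}\ e\ \mathbf{do}\ c_1, \mu)$.
And since $\nu(e) \neq 0$, $(\mathbf{while}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow (c_1; \mathbf{while}\ e\ \mathbf{do}\ c_1, \nu)$.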

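5. Case $\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2$. Since $c$ is protected, we have $\gamma \vdash e : L$. Therefore,
$\mu(e) = \nu(e)$ by Simple Security. So if $\mu(e) \neq 0$ then $(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \mu) \rightarrow (c_1, \mu)$.
And since $\nu(e) \neq 0$, we have $(\mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2, \nu) \rightarrow (c_1, \nu)$. The case when
$\mu(e) = 0$ is similar.

6. Case $\mathbf{for}\ e\ \mathbf{do}\ c_1$. Since $c$ is protected, $\gamma \vdash e : L$, and as above $\mu(e) = \nu(e)$. So if
$\mu(e) \leq 0$ then $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow \mu$ and, since $\mu(e) = \nu(e)$, $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow \nu$. If
$\mu(e) > 0$, then $(\mathbf{for}\ e\ \mathbf{do}\ c_1, \mu) \rightarrow (c_1; \mathbf{for}\ \mu(e) - 1\ \mathbf{do}\ c_1, \mu)$. And since $\nu(e) > 0$,
$(\mathbf{for}\ e\ \mathbf{do}\ c_1, \nu) \rightarrow (c_1; \mathbf{for}\ \nu(e) - 1\ \mathbf{do}\ c_1, \nu)$.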

7. Case $\mathbf{protect}\ c_1$. By rule ATOMICITY, $(c_1, \mu) \rightarrow^{*} \mu'$. Since protect blocks are not
nested, $c_1$ is protect free. Further, it is well typed under $\gamma$ by rule PROTECT since $c$ is
well typed. So by Mutual Termination, there is a memory $\nu'$ such that $(c_1, \nu) \rightarrow^{*} \nu'$
and $\mu' \sim_\gamma \nu'$. Thus, $(\mathbf{protect}\ c_1, \nu) \rightarrow \nu'$ by rule ATOMICITY.

□ 

Now  we  wish  to  extend  the  Lock  Step  Execution  Lemma  to  probabilistic  states.  First, 
we  need  a  notion  of  equivalence  among  probabilistic  states.  The  basic  idea  is  that  two 
probabilistic  states  are  equivalent  under  7  if  they  are  the  same  after  any  high  variables  are 
projected  out.  Suppose,  for  example,  that  x  :  H  and  y  :  L.  Then 

{ (O, [x := 0, y := 0]) : 1/3,
  (O, [x := 1, y := 0]) : 1/3,
  (O', [x := 0, y := 1]) : 1/3 }

is equivalent to

{ (O, [x := 2, y := 0]) : 2/3, (O', [x := 3, y := 1]) : 1/3 },

because in each case the result of projecting out the high variable x is

{ (O, [y := 0]) : 2/3, (O', [y := 1]) : 1/3 }.

Notice  that  projecting  out  high  variables  can  cause  several  configurations  to  collapse  into 
one,  requiring  summation  of  their  probabilities.  More  formally,  we  define  equivalence  as 
follows: 

Definition 5.4 Given variable typing $\gamma$ and memory $\mu$, let $\mu_\gamma$ denote the result of erasing
all high variables from $\mu$. And given probabilistic state $s$, let the projection of $s$ onto the
low variables of $\gamma$, denoted $s_\gamma$, be defined by

$$s_\gamma(O, \mu_\gamma) = \sum_{\nu\,:\,\nu_\gamma = \mu_\gamma} s(O, \nu)$$

Finally, we say that probabilistic states $s$ and $s'$ are equivalent under $\gamma$, written $s \sim_\gamma s'$, if
$s_\gamma = s'_\gamma$.

A probabilistic state $s$ is well typed and protected under $\gamma$ if for every global configuration
$(O, \mu)$ where $s(O, \mu) > 0$, $O$ is well typed and protected under $\gamma$, and $dom(\mu) =
dom(\gamma)$.

For any global configuration $(O, \mu)$, the point mass on $(O, \mu)$, denoted $\iota_{(O,\mu)}$, is the
probabilistic state that gives probability 1 to $(O, \mu)$ and probability 0 to all other global
configurations.
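
The projection and the equivalence it induces are easy to compute for finite states. The following Python sketch replays the example given before Definition 5.4. It is an illustration under stated assumptions: a probabilistic state is a dictionary from (pool, memory) pairs to exact fractions that stores only positive entries, thread pools are stood in for by the labels "O" and "O'", and the helper names freeze, project, equivalent and point_mass are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def freeze(memory):
    """Turn a dict memory into a hashable, order-independent tuple."""
    return tuple(sorted(memory.items()))


def project(s, gamma):
    """Erase high variables and sum the probabilities of configurations that collapse."""
    out = defaultdict(Fraction)
    for (pool, memory), prob in s.items():
        low = tuple((x, v) for x, v in memory if gamma[x] == "L")
        out[(pool, low)] += prob
    return dict(out)


def equivalent(s1, s2, gamma):
    """s1 and s2 are equivalent under gamma when their projections coincide."""
    return project(s1, gamma) == project(s2, gamma)


def point_mass(pool, memory):
    """The state giving probability 1 to (pool, memory)."""
    return {(pool, freeze(memory)): Fraction(1)}


gamma = {"x": "H", "y": "L"}
third = Fraction(1, 3)
s = {
    ("O", freeze({"x": 0, "y": 0})): third,
    ("O", freeze({"x": 1, "y": 0})): third,
    ("O'", freeze({"x": 0, "y": 1})): third,
}
t = {
    ("O", freeze({"x": 2, "y": 0})): 2 * third,
    ("O'", freeze({"x": 3, "y": 1})): third,
}
assert equivalent(s, t, gamma)
```

Exact fractions are used instead of floats so that the equality test on projections is not disturbed by rounding.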

Now we can show, as a corollary to the Lock Step Execution Lemma, that $\sim_\gamma$ is a
congruence with respect to the transition matrix $T$ on well-typed, protected point masses.

Lemma 5.8 (Congruence on Point Masses) If $\iota$ and $\iota'$ are well-typed, protected point
masses such that $\iota \sim_\gamma \iota'$, then $\iota T \sim_\gamma \iota' T$.

Proof. Since $\iota \sim_\gamma \iota'$, there must exist a thread pool $O$ and memories $\mu$ and $\nu$ such that
$\iota = \iota_{(O,\mu)}$, $\iota' = \iota_{(O,\nu)}$, and $\mu \sim_\gamma \nu$.

If $O = \{\,\}$, then by the third (GLOBAL) rule, we see that $\iota T = \iota$ and $\iota' T = \iota'$. So $\iota T \sim_\gamma \iota' T$.
Now suppose that $O$ is nonempty. We show that for every $(O', \mu')$ where $(\iota T)(O', \mu') > 0$,
there is a $\nu'$ such that $\mu' \sim_\gamma \nu'$ and $(\iota T)(O', \mu') = (\iota' T)(O', \nu')$. So suppose $(O', \mu')$ is a global
configuration and $(\iota T)(O', \mu') > 0$. Since $\iota$ is a point mass,

$$(\iota T)(O', \mu') = T((O, \mu), (O', \mu'))$$

Therefore, $T((O, \mu), (O', \mu')) > 0$. By the definition of $T$, then, $T((O, \mu), (O', \mu')) = 1/|O|$
and there is a thread $\alpha$ and command $c$ such that $O(\alpha) = c$ and either

1. $(c, \mu) \rightarrow (c', \mu')$ and $O' = O[\alpha := c']$, or else

2. $(c, \mu) \rightarrow \mu'$ and $O' = O - \alpha$.

In the first case, we have, by the Lock Step Execution Lemma, that there exists $\nu'$ such
that $(c, \nu) \rightarrow (c', \nu')$ and $\mu' \sim_\gamma \nu'$. Then, by rule (GLOBAL), $(O, \nu)$ steps to $(O[\alpha := c'], \nu')$
with probability $1/|O|$, so by definition of $T$,

$$T((O, \nu), (O', \nu')) = 1/|O|$$

But $\iota'$ is also a point mass, therefore

$$(\iota' T)(O', \nu') = T((O, \nu), (O', \nu'))$$

Thus, $(\iota T)(O', \mu') = 1/|O| = (\iota' T)(O', \nu')$. The second case above is similar.

So for a given configuration $(O, \mu)$, if $\mu \sim_\gamma \nu$ and $(\iota T)(O, \nu) > 0$, then there exists $\nu'$
such that $\nu \sim_\gamma \nu'$ and $(\iota' T)(O, \nu') = (\iota T)(O, \nu)$ from above. Since $\nu' \sim_\gamma \mu$, $(\iota' T)(O, \nu')$ must
be in the sum $(\iota' T)_\gamma(O, \mu_\gamma)$. Therefore, $(\iota T)_\gamma(O, \mu_\gamma) \leq (\iota' T)_\gamma(O, \mu_\gamma)$. Symmetrically, we
have $(\iota T)_\gamma(O, \mu_\gamma) \geq (\iota' T)_\gamma(O, \mu_\gamma)$ and so $(\iota T)_\gamma = (\iota' T)_\gamma$, or $\iota T \sim_\gamma \iota' T$. □
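
To make the action of $T$ on a probabilistic state concrete, the following Python sketch applies one uniformly scheduled step to a finite state. It is a toy model under explicit assumptions: each thread is a nonempty tuple of constant assignments (x, v), every assignment is one atomic step, a pool is a tuple of threads identified by position rather than by name, and the functions freeze, successors and apply_T are hypothetical names introduced here. It follows the shape of the global rules as used in the proof above (a uniform choice among threads, and an empty pool stepping to itself), not the authors' definition verbatim.

```python
from collections import defaultdict
from fractions import Fraction


def freeze(memory):
    """Turn a dict memory into a hashable, order-independent tuple."""
    return tuple(sorted(memory.items()))


def successors(config):
    """All (probability, next configuration) pairs for one scheduler step."""
    pool, memory = config
    if not pool:                                   # empty pool: the configuration is a fixed point
        return [(Fraction(1), config)]
    result = []
    for idx, thread in enumerate(pool):
        (x, v), rest = thread[0], thread[1:]
        new_memory = freeze({**dict(memory), x: v})
        # A finished thread leaves the pool; otherwise its remaining commands replace it.
        new_pool = pool[:idx] + ((rest,) if rest else ()) + pool[idx + 1:]
        result.append((Fraction(1, len(pool)), (new_pool, new_memory)))
    return result


def apply_T(s):
    """Compute sT: push every configuration's mass along its successors."""
    out = defaultdict(Fraction)
    for config, prob in s.items():
        for q, nxt in successors(config):
            out[nxt] += prob * q
    return dict(out)


# Two threads racing to write the low variable y.
pool = ((("y", 0),), (("y", 1),))
s0 = {(pool, freeze({"y": 5})): Fraction(1)}
s1 = apply_T(s0)
s2 = apply_T(s1)
assert sum(s1.values()) == 1 and sum(s2.values()) == 1
assert apply_T(s2) == s2                           # both threads are done
```

Because apply_T distributes each configuration's mass independently, it is linear in s by construction, which is the property the generalization to arbitrary states relies on.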

Now we wish to generalize the above Congruence Lemma from point masses to arbitrary
probabilistic states; this generalization is a direct consequence of the linearity of $T$. More
precisely, the set of all measures forms a vector space if we define

• $(s + s')(O, \mu) = s(O, \mu) + s'(O, \mu)$, for measures $s$ and $s'$, and

• $(as)(O, \mu) = a(s(O, \mu))$, for real $a$ and measure $s$.

With respect to this vector space, $T$ is a linear transformation. Furthermore, $\sim_\gamma$ is a
congruence with respect to the vector space operations:

Lemma 5.9 If $s_i \sim_\gamma s_i'$ for all $i$, then

$$a_1 s_1 + a_2 s_2 + a_3 s_3 + \cdots \;\sim_\gamma\; a_1 s_1' + a_2 s_2' + a_3 s_3' + \cdots$$
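
Proof. First, we have that projections are homomorphic,

$$(a_1 s_1 + a_2 s_2 + \cdots)_\gamma = (a_1 s_1)_\gamma + (a_2 s_2)_\gamma + \cdots$$

since for all $(O, \mu)$,

$$\begin{aligned}
(a_1 s_1 + a_2 s_2 + \cdots)_\gamma(O, \mu_\gamma)
  &= \sum_{\nu\,:\,\nu_\gamma = \mu_\gamma} \bigl(a_1 s_1(O, \nu) + a_2 s_2(O, \nu) + \cdots\bigr) \\
  &= \sum_{\nu_1\,:\,(\nu_1)_\gamma = \mu_\gamma} a_1 s_1(O, \nu_1) + \sum_{\nu_2\,:\,(\nu_2)_\gamma = \mu_\gamma} a_2 s_2(O, \nu_2) + \cdots \\
  &= (a_1 s_1)_\gamma(O, \mu_\gamma) + (a_2 s_2)_\gamma(O, \mu_\gamma) + \cdots \\
  &= \bigl((a_1 s_1)_\gamma + (a_2 s_2)_\gamma + \cdots\bigr)(O, \mu_\gamma)
\end{aligned}$$

Further, $(as)_\gamma = a s_\gamma$ since for all $(O, \mu)$,

$$(as)_\gamma(O, \mu_\gamma) = \sum_{\nu\,:\,\nu_\gamma = \mu_\gamma} (as)(O, \nu) = \sum_{\nu\,:\,\nu_\gamma = \mu_\gamma} a(s(O, \nu)) = a \sum_{\nu\,:\,\nu_\gamma = \mu_\gamma} s(O, \nu) = a s_\gamma(O, \mu_\gamma)$$

Finally, we have

$$\begin{aligned}
(a_1 s_1 + a_2 s_2 + \cdots)_\gamma
  &= (a_1 s_1)_\gamma + (a_2 s_2)_\gamma + \cdots \\
  &= a_1 (s_1)_\gamma + a_2 (s_2)_\gamma + \cdots \\
  &= a_1 (s_1')_\gamma + a_2 (s_2')_\gamma + \cdots \\
  &= (a_1 s_1')_\gamma + (a_2 s_2')_\gamma + \cdots \\
  &= (a_1 s_1' + a_2 s_2' + \cdots)_\gamma
\end{aligned}$$

So $a_1 s_1 + a_2 s_2 + \cdots \sim_\gamma a_1 s_1' + a_2 s_2' + \cdots$. □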

Theorem 5.10 (Probabilistic Noninterference) If $s$ and $s'$ are well-typed, protected
probabilistic states such that $s \sim_\gamma s'$, then $sT \sim_\gamma s'T$.

Proof. To begin with, we argue that $s$ and $s'$ can be expressed as (possibly countably
infinite) linear combinations of (not necessarily distinct) point masses such that the corresponding
coefficients are the same, and the corresponding point masses are equivalent.
Now, we know that we can express $s$ and $s'$ as linear combinations of point masses:

$$s = a_1 \iota_1 + a_2 \iota_2 + a_3 \iota_3 + \cdots$$

and

$$s' = b_1 \iota_1' + b_2 \iota_2' + b_3 \iota_3' + \cdots$$

Assume, for now, that $s_\gamma$ (and $s'_\gamma$) is a point mass. That is, $\iota_i \sim_\gamma \iota_j \sim_\gamma \iota_i' \sim_\gamma \iota_j'$ for all $i$
and $j$.

Note that the $a_i$'s and $b_i$'s both sum to 1; hence they both can be understood as
partitioning the unit interval [0, 1]:


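[Figure: the unit interval from 0 to 1, partitioned in one way into segments of lengths a_1, a_2, a_3, ... and in another way into segments of lengths b_1, b_2, b_3, ...]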


To unify the coefficients in the two linear combinations, we must take the union of the two
partitions, splitting up any terms that cross partition boundaries. For example, based on
the picture above we would split the term $a_1 \iota_1$ of $s$ into $b_1 \iota_1 + (a_1 - b_1)\iota_1$. And we would
split the term $b_2 \iota_2'$ of $s'$ into $(a_1 - b_1)\iota_2' + (b_2 - (a_1 - b_1))\iota_2'$. Continuing in this way, we can
unify the coefficients of $s$ and $s'$.

We can describe the splitting process more precisely as follows. We simultaneously
traverse $s$ and $s'$, splitting terms as we go. Let $a\iota$ and $b\iota'$ be the next terms to be unified.
If $a = b$, then keep both these terms unchanged. If $a < b$, then keep term $a\iota$ in $s$, but split
$b\iota'$ into $a\iota'$ and $(b - a)\iota'$ in $s'$. Handle the case $a > b$ symmetrically. If one or both of the
sums are infinite, then of course the algorithm gives an infinite sum. But each term of $s$
and of $s'$ is split only finitely often (otherwise the $a_i$'s and $b_i$'s would not have the same
sum) with one exception: if $s$ is a finite sum and $s'$ is an infinite sum, then the last term
of $s$ is split into an infinite sum.
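
For finite sums, the splitting process just described amounts to a two-pointer merge of the two partitions of the unit interval. The following Python sketch is an illustration only: it assumes both sums are finite, all coefficients are positive exact fractions, and the two sums have the same total; the function name unify and the string labels standing in for point masses are hypothetical. The countably infinite case discussed above is not covered.

```python
from fractions import Fraction


def unify(s, t):
    """Split the terms of s and t so that corresponding coefficients agree.

    s and t are lists of (coefficient, point_mass) pairs with equal totals.
    Returns a list of (coefficient, point_mass_of_s, point_mass_of_t) triples.
    """
    if not s or not t:
        return []
    out = []
    i = j = 0
    a, b = s[0][0], t[0][0]              # mass of the current terms not yet matched
    while i < len(s) and j < len(t):
        c = min(a, b)                    # the smaller term is kept, the larger one is split
        out.append((c, s[i][1], t[j][1]))
        a -= c
        b -= c
        if a == 0:                       # current term of s is used up
            i += 1
            if i < len(s):
                a = s[i][0]
        if b == 0:                       # current term of t is used up
            j += 1
            if j < len(t):
                b = t[j][0]
    return out


half, third, sixth = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)
s = [(half, "i1"), (half, "i2")]
t = [(third, "j1"), (third, "j2"), (third, "j3")]
unified = unify(s, t)
assert unified == [(third, "i1", "j1"), (sixth, "i1", "j2"),
                   (sixth, "i2", "j2"), (third, "i2", "j3")]
```

Reading off the second and third components of the result gives the two rewritten sums with identical coefficient sequences.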

So far, we have shown how to unify the coefficients of $s$ and $s'$ in the case where $s_\gamma$
(and $s'_\gamma$) is a point mass. In the general case, $s$ and $s'$ must first be rearranged into sums
of sums of equivalent point masses:

$$s = (a_{11}\iota_{11} + a_{12}\iota_{12} + \cdots) + (a_{21}\iota_{21} + a_{22}\iota_{22} + \cdots) + \cdots$$

and

$$s' = (b_{11}\iota'_{11} + b_{12}\iota'_{12} + \cdots) + (b_{21}\iota'_{21} + b_{22}\iota'_{22} + \cdots) + \cdots$$

where $\iota_{ij} \sim_\gamma \iota_{ik} \sim_\gamma \iota'_{ij} \sim_\gamma \iota'_{ik}$ for all $i$, $j$, and $k$. Also, for each $i$, $\sum_j a_{ij} = \sum_j b_{ij}$. Hence
we can apply the algorithm above to unify the $a_{1j}$'s with the $b_{1j}$'s, the $a_{2j}$'s with the $b_{2j}$'s,
and so forth. Then we can form a single sum for $s$ and for $s'$ by interleaving these sums in
a standard way.

The final result of all this effort is that we can express $s$ and $s'$ as

$$s = c_1 \iota_1'' + c_2 \iota_2'' + c_3 \iota_3'' + \cdots$$

and

$$s' = c_1 \iota_1''' + c_2 \iota_2''' + c_3 \iota_3''' + \cdots$$

where $\iota_i'' \sim_\gamma \iota_i'''$ for all $i$. Now, since $T$ is a linear transformation, we have

$$sT = c_1(\iota_1'' T) + c_2(\iota_2'' T) + c_3(\iota_3'' T) + \cdots$$

and

$$s'T = c_1(\iota_1''' T) + c_2(\iota_2''' T) + c_3(\iota_3''' T) + \cdots$$

By the Congruence on Point Masses Lemma, we have $\iota_i'' T \sim_\gamma \iota_i''' T$, for all $i$. So, by
Lemma 5.9, $sT \sim_\gamma s'T$. □

6  Discussion 

The  need  for  a  probabilistic  view  of  security  in  nondeterministic  computer  systems  has 
been  understood  for  some  time  [17,  13].  Security  properties  (models)  to  treat  probabilistic 
channels in nondeterministic systems have been formulated by McLean [12] and Gray [7, 8].
It is important, however, to recognize that these efforts address a different problem than
what we consider in this paper. They consider a computer system with a number of users,
classified  as  high  or  low,  who  send  inputs  to  and  receive  outputs  from  the  system.  The 
problem  is  to  prevent  high  users,  who  have  access  to  high  information,  from  communicating 
with  low  users,  who  should  have  access  only  to  low  information.  What  makes  treating 
privacy in this setting especially difficult is that users need not be processes under control
of the system; they may be people, who are external to the system and who can observe the
system’s behavior from the outside. As a result, a high user may be able to communicate
covertly  by  modulating  system  performance  to  encode  high  information  that  a  low  user  in 
turn  decodes  using  a  real-time  clock  outside  the  system.  Furthermore,  because  the  low  user 
is  measuring  real  time,  the  modulations  can  depend  on  low-level  system  implementation 
details,  such  as  the  paging  and  caching  behavior  of  the  underlying  hardware.  This  implies 
that  it  is  not  enough  to  prove  privacy  with  respect  to  a  high-level,  abstract  system  semantics 
(like  the  semantics  of  Figure  1).  To  guarantee  privacy,  it  is  necessary  for  the  system  model 
to  address  all  the  implementation  details. 

In  a  mobile-code  framework,  where  hosts  are  trusted,  ensuring  privacy  is  more  tractable. 
A  key  assumption  here  is  that  any  attempt  to  compromise  privacy  must  arise  from  within 
the  mobile  code,  which  is  internal  to  the  system.  As  a  result,  the  system  can  control  what 
the  mobile  code  can  do  and  what  it  can  observe.  For  example,  if  mobile-code  threads  are 
not  allowed  to  see  a  real-time  clock,  then  they  can  measure  the  timing  of  other  threads  only 
by  observing  variations  in  thread  interleavings.  Hence,  assuming  a  correct  implementation 
of  our  semantics,  threads  will  not  be  able  to  detect  any  variations  in  the  running  time 
of  a  protected  command,  nor  will  they  be  able  to  detect  timing  variations  due  to  low- 
level  implementation  details.  Consequently,  timing  attacks  are  impossible  in  well-typed, 
protected programs in our language. For instance, Kocher describes a timing attack on RSA
[10]. Basically, he argues that an attacker can discover a private key x by observing the
amount of time required by several modular exponentiations y^x mod n. In our framework,
one would use a protected for loop to implement modular exponentiation, which means that
no useful timing information about exponentiation would be available to other threads: it
would always appear to execute in exactly one step.
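
To see why unprotected modular exponentiation can leak timing information, consider the following Python sketch. It is an illustration only, not Kocher's attack and not code from our language: the choice of the right-to-left square-and-multiply algorithm and the name modexp_with_count are assumptions of the sketch. The point is that the number of multiplications performed depends on the bits of the private exponent x, so the number of small steps taken would vary with x unless the loop ran atomically.

```python
def modexp_with_count(y, x, n):
    """Compute y**x mod n by square-and-multiply, counting exponent-dependent multiplications."""
    result, base, multiplications = 1 % n, y % n, 0
    while x > 0:
        if x & 1:                        # this branch is taken once per 1-bit of x
            result = (result * base) % n
            multiplications += 1
        base = (base * base) % n
        x >>= 1
    return result, multiplications


# Same bit length, different Hamming weight: the work done differs.
value_a, work_a = modexp_with_count(7, 0b1111, 101)
value_b, work_b = modexp_with_count(7, 0b1000, 101)
assert value_a == pow(7, 0b1111, 101) and value_b == pow(7, 0b1000, 101)
assert work_a != work_b
```

Under the atomicity rule for protect, a command wrapping such a loop takes a single step regardless of x, so other threads observing only interleavings cannot distinguish the two runs.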

7  Other  related  research 

Other  work  in  secure  information  flow,  in  a  parallel  setting,  includes  that  of  Andrews  and 
Reitman  [1],  Melliar-Smith  and  Moser  [14],  Focardi  and  Gorrieri  [5,  6],  and  Banatre  and 
Bryce [2]. Melliar-Smith and Moser consider covert channels in Ada. They describe a data
dependency  analysis  to  find  places  where  Ada  programs  depend  on  the  relative  timing  of 
operations  within  a  system.  Andrews  and  Reitman  give  an  axiomatic  flow  logic  for  treating 
information  flow  in  the  presence  of  process  synchronization.  They  also  sketch  how  one  might 
treat  timing  channels  in  the  logic.  Banatre  and  Bryce  give  an  axiomatic  flow  logic  for  CSP 
processes,  also  treating  information  flow  arising  from  synchronization.  None  of  these  efforts, 
though,  gives  a  satisfactory  account  of  the  security  properties  that  they  guarantee.  Focardi 
and  Gorrieri  identify  trace-based  and  bisimulation-based  security  properties  for  systems 
expressed  in  an  extension  of  Milner’s  CCS,  which  they  call  the  Security  Process  Algebra. 
These  properties,  however,  are  possibilistic  in  nature  (e.g.  a  system  is  SNNI  [6]  if  the  set  of 
traces that a low observer can see of a system is possible regardless of whether high-level
actions  are  enabled  or  disabled  in  the  system). 


8  Conclusion 

So  what  is  the  significance  of  our  result?  It  depends  on  what  can  be  observed.  With 
respect  to  internal  program  behavior,  our  Probabilistic  Noninterference  result  rules  out 
all  covert  flows  from  high  variables  to  low  variables.  But  if  external  observation  of  the 
running  program  is  allowed,  then  of  course  covert  channels  of  the  kind  discussed  in  Section  6 
remain  possible.  Note,  however,  that  the  mobile  code  setting  affords  us  more  control  over 
external  observations  than  would  normally  be  possible.  When  we  execute  some  mobile 
code  on  our  machine,  we  can  limit  communication  with  the  outside  world,  preventing 
precise  observations  of  a  program’s  behavior  such  as  its  running  time.  Depending  on  the 
application,  one  can  build  enough  noise  into  the  mobile  code’s  interface  with  the  outside  in 
various  ways  to  significantly  reduce  the  capacity  of  an  externally-observable  timing  channel. 
See,  for  example,  the  NRL  Pump  for  secure  acknowledgment  [9]. 